I. Introduction

Network information theory generally focuses on applications that, in the open systems
interconnection (OSI) model of network architecture, lie in the physical layer. In this context, there
are some networked systems, such as those represented by the multiple-access channel and the 
broadcast channel, that are well understood, but there are many that remain largely intractable. 
Even some very simple networked systems, such as those represented by the relay channel and 
the interference channel, have unknown capacities. 

But the relevance of network information theory is not limited to the physical layer. In practice, 
the physical layer never provides a fully-reliable bit pipe to higher layers, and reliability then falls 
on the data link control, network, and transport layers. These layers need to provide reliability 
not only because of an unreliable physical layer, but also because of packet losses resulting from 
causes such as congestion (which leads to buffer overflows) and interference (which leads to 
collisions). Rather than coding over channel symbols, though, coding is applied over packets, i.e. 
rather than determining each node's outgoing channel symbols through arbitrary, causal mappings 
of their received symbols, the contents of each node's outgoing packets are determined through 
arbitrary, causal mappings of the contents of their received packets. Such packet-level coding 
offers an alternative domain for network information theory and an alternative opportunity for 
efficiency gains resulting from cooperation, and it is the subject of our paper. 

Packet-level coding differs from symbol-level coding in three principal ways: First, in most 
packetized systems, packets received in error are dropped, so we need to code only for resilience 
against erasures and not for noise. Second, it is acceptable to append a degree of side-information 
to packets by including it in their headers. Third, packet transmissions are not synchronized in 
the way that symbol transmissions are — in particular, it is not reasonable to assume that packet 
transmissions occur on every link in a network at identical, regular intervals. These factors make 
for a problem that is different from, but related to, symbol-level coding. Thus, our work addresses a
problem of importance in its own right that may also have implications for network information
theory in its regular, symbol-level setting.

Aside from these three principal differences, packet-level coding is simply symbol-level coding 
with packets as the symbols. Thus, given a specification of network use (i.e. packet injection 
times), a code specifies the causal mappings that nodes apply to packets to determine their 
contents; and, given a specification of erasure locations in addition to the specification of network 
use (or, simply, given packet reception times corresponding to certain injection times), we can 
define capacity as the maximum reliable rate (in packets per unit time) that can be achieved. 
Thus, when we speak of capacity, we speak of Shannon capacity as it is normally defined in 
network information theory (save with packets as the symbols). We do not speak of the various 
other notions of capacity in networking literature. 

The prevailing approach to packet-level coding uses a feedback code: Automatic repeat request 
(ARQ) is used to request the retransmission of lost packets either on a link-by-link basis, an end- 
to-end basis, or both. This approach often works well and has a sound theoretical basis: It is well 
known that, given perfect feedback, retransmission of lost packets is a capacity-achieving strategy 
for reliability on a point-to-point link (see, for example, [1, Section 8.1.5]). Thus, if achieving 
a network connection meant transmitting packets over a series of uncongested point-to-point 
links with reliable, delay-free feedback, then retransmission would clearly be optimal. This situation
is approximated in lightly-congested, highly-reliable wireline networks, but it is generally not 
the case. First, feedback may be unreliable or too slow, which is often the case in satellite or 
wireless networks or when servicing real-time applications. Second, congestion can always arise 
in packet networks; hence the need for retransmission on an end-to-end basis. But, if the links are 
unreliable enough to also require retransmission on a link-by-link basis, then the two feedback 
loops can interact in complicated, and sometimes undesirable, ways [2], [3]. Moreover, such 
end-to-end retransmission requests are not well-suited for multicast connections, where, because 
requests are sent by each terminal as packets are lost, there may be many requests, placing an 
unnecessary load on the network and possibly overwhelming the source; and packets that are 
retransmitted are often only of use to a subset of the terminals and therefore redundant to the 
remainder. Third, we may not be dealing with point-to-point links at all. Wireless networks are the 
obvious case in point. Wireless links are often treated as point-to-point links, with packets being 
routed hop-by-hop toward their destinations, but, if the lossiness of the medium is accounted for, 
this approach is sub-optimal. In general, the broadcast nature of the links should be exploited; 
and, in this case, a great deal of feedback would be required to achieve reliable communication
using a retransmission-based scheme. 

In this paper, therefore, we eschew this approach in favor of one that operates mainly in 
a feedforward manner. Specifically, we consider the following coding scheme: Nodes store the 
packets they receive into their memories and, whenever they have a transmission opportunity, they 
form coded packets with random linear combinations of their memory contents. This strategy, 
we shall show, is capacity-achieving, for both single unicast and single multicast connections 
and for models of both wireline and wireless networks, as long as packets received on each 
link arrive according to a process that has an average rate. Thus, packet losses on a link may 
exhibit correlation in time or with losses on other links, capturing various mechanisms for loss — 
including collisions. 

The scheme has several other attractive properties: It is decentralized, requiring no coordination 
among nodes; and it can be operated ratelessly, i.e. it can be run indefinitely until successful 
decoding (at which stage that fact is signaled to other nodes, requiring an amount of feedback 
that, compared to ARQ, is small), which is a particularly useful property in packet networks, 
where loss rates are often time-varying and not known precisely. 

Decoding can be done by matrix inversion, which is a polynomial-time procedure. Thus, 
though we speak of random coding, our work differs significantly from that of Shannon [4], 
[5] and Gallager [6] in that we do not seek to demonstrate existence. Indeed, the existence of 
capacity-achieving linear codes for the scenarios we consider already follows from the results 
of [7]. Rather, we seek to show the asymptotic rate optimality of a specific scheme that we 
believe may be practicable and that can be considered as the prototype for a family of related, 
improved schemes; for example, LT codes [8], Raptor codes [9], Online codes [10], RT oblivious 
erasure-correcting codes [11], and the greedy random scheme proposed in [12] are related coding 
schemes that apply only to specific, special networks but, using varying degrees of feedback, 
achieve lower decoding complexity or memory usage. Our work therefore brings forth a natural 
code design problem, namely to find such related, improved schemes. 

We begin by describing the coding scheme in the following section. In Section III, we describe
our model and illustrate it with several examples. In Section IV, we present coding theorems
that prove that the scheme is capacity-achieving and, in Section V, we strengthen these results
in the special case of Poisson traffic with i.i.d. losses by giving error exponents. These error
exponents allow us to quantify the rate of decay of the probability of error with coding delay
and to determine the parameters of importance in this decay. 

II. Coding scheme 

We suppose that, at the source node, we have $K$ message packets $w_1, w_2, \ldots, w_K$, which
are vectors of length $\lambda$ over the finite field $\mathbb{F}_q$. (If the packet length is $b$ bits, then we take
$\lambda = \lceil b / \log_2 q \rceil$.) The message packets are initially present in the memory of the source node.

The coding operation performed by each node is simple to describe and is the same for every 
node: Received packets are stored into the node's memory, and packets are formed for injection 
with random linear combinations of its memory contents whenever a packet injection occurs on 
an outgoing link. The coefficients of the combination are drawn uniformly from $\mathbb{F}_q$.

Since all coding is linear, we can write any packet $x$ in the network as a linear combination
of $w_1, w_2, \ldots, w_K$, namely, $x = \sum_{k=1}^{K} \gamma_k w_k$. We call $\gamma$ the global encoding vector of $x$, and we
assume that it is sent along with $x$, as side information in its header. The overhead this incurs
(namely, $K \log_2 q$ bits) is negligible if packets are sufficiently large.

Nodes are assumed to have unlimited memory. The scheme can be modified so that received 
packets are stored into memory only if their global encoding vectors are linearly-independent of 
those already stored. This modification keeps our results unchanged while ensuring that nodes 
never need to store more than K packets. 

A sink node collects packets and, if it has K packets with linearly-independent global encoding 
vectors, it is able to recover the message packets. Decoding can be done by Gaussian elimination. 
The scheme can be run either for a predetermined duration or, in the case of rateless operation, 
until successful decoding at the sink nodes. We summarize the scheme in Figure 1.

The scheme is carried out for a single block of K message packets at the source. If the source 
has more packets to send, then the scheme is repeated with all nodes flushed of their memory 
contents. 

Similar random linear coding schemes are described in [13], [14], [15], [16] for the application
of multicast over lossless wireline packet networks, in [17] for data dissemination, in [18] for
data storage, and in [19] for content distribution over peer-to-peer overlay networks. Other
coding schemes for lossy packet networks are described in [7] and [20]; the scheme described
in the former requires placing in the packet headers side information that grows with the size of
the network, while that described in the latter requires no side information at all, but achieves
lower rates in general. Both of these coding schemes, moreover, operate in a block-by-block
manner, where coded packets are sent by intermediate nodes only after decoding a block of
received packets, a strategy that generally incurs more delay than the scheme we consider,
where intermediate nodes perform additional coding yet do not decode [12].

Initialization:

• The source node stores the message packets $w_1, w_2, \ldots, w_K$ in its memory.

Operation:

• When a packet is received by a node,

- the node stores the packet in its memory.

• When a packet injection occurs on an outgoing link of a node,

- the node forms the packet from a random linear combination of the packets
in its memory. Suppose the node has $L$ packets $y_1, y_2, \ldots, y_L$ in its memory.
Then the packet formed is

$$x := \sum_{l=1}^{L} \alpha_l y_l,$$

where $\alpha_l$ is chosen according to a uniform distribution over the elements of
$\mathbb{F}_q$. The packet's global encoding vector $\gamma$, which satisfies $x = \sum_{k=1}^{K} \gamma_k w_k$,
is placed in its header.

Decoding:

• Each sink node performs Gaussian elimination on the set of global encoding
vectors from the packets in its memory. If it is able to find an inverse, it applies
the inverse to the packets to obtain $w_1, w_2, \ldots, w_K$; otherwise, a decoding error
occurs.

Fig. 1. Summary of the random linear coding scheme we consider.
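
As an illustration of the scheme summarized in Fig. 1, the following is a minimal Python sketch rather than a definitive implementation. It assumes that $q$ is prime (here 257), so that arithmetic in $\mathbb{F}_q$ reduces to integer arithmetic modulo $q$; the names `Node`, `inject`, and `decode`, the two-link topology, and the per-link delivery probability of 0.7 are hypothetical choices made only for this example.

```python
import random

Q = 257  # assumed prime field size, so F_q arithmetic is integers mod Q


class Node:
    """Stores received packets as (global encoding vector, payload) pairs."""

    def __init__(self, rng):
        self.memory = []
        self.rng = rng

    def receive(self, packet):
        self.memory.append(packet)

    def inject(self):
        """Form a packet as a uniform random linear combination of memory."""
        k = len(self.memory[0][0])
        lam = len(self.memory[0][1])
        gamma, payload = [0] * k, [0] * lam
        for vec, data in self.memory:
            a = self.rng.randrange(Q)  # coefficient drawn uniformly from F_q
            gamma = [(g + a * v) % Q for g, v in zip(gamma, vec)]
            payload = [(p + a * d) % Q for p, d in zip(payload, data)]
        return gamma, payload


def decode(packets, k):
    """Gauss-Jordan elimination on [gamma | payload]; None if rank < k."""
    rows = [list(g) + list(p) for g, p in packets]
    for col in range(k):
        pivot = next((r for r in range(col, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None  # global encoding vectors do not yet have rank k
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], Q - 2, Q)  # inverse via Fermat's little theorem
        rows[col] = [x * inv % Q for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % Q for x, y in zip(rows[r], rows[col])]
    return [row[k:] for row in rows[:k]]


rng = random.Random(1)
K, LAM = 4, 6
messages = [[rng.randrange(Q) for _ in range(LAM)] for _ in range(K)]
source, relay, sink = Node(rng), Node(rng), Node(rng)
for k, w in enumerate(messages):
    # The k-th message packet has the k-th unit vector as its global encoding vector.
    source.receive(([int(i == k) for i in range(K)], w))

recovered = None
while recovered is None:  # rateless operation: run until the sink decodes
    if rng.random() < 0.7:  # packet on link source -> relay is not lost
        relay.receive(source.inject())
    if relay.memory and rng.random() < 0.7:  # packet on link relay -> sink is not lost
        sink.receive(relay.inject())
    recovered = decode(sink.memory, K)
```

Note that the relay never decodes: it only stores what it receives and re-mixes it, which is the feature that distinguishes the scheme from the block-by-block alternatives mentioned above.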

III. Model 

Existing models used in network information theory (see, for example, [1, Section 14.10]) are 
generally conceived for symbol-level coding and, given the peculiarities of packet-level coding,
are not suitable for our purpose. One key difference, as we mentioned, is that packet transmissions
are not synchronized in the way that symbol transmissions are. Thus, we do not have a slotted
system where packets are injected on every link at every slot, and we must therefore have a
schedule that determines when (in continuous time) and where (i.e. on which link) each packet
is injected. In this paper, we assume that such a schedule is given, and we do not address the
problem of determining it. This problem, of determining the schedule to use, is a difficult problem 
in its own right, especially in wireless packet networks. Various instances of the problem are 
treated in [21], [22], [23], [24], [25], [26], [27]. 

Given a schedule of packet injections, the network responds with packet receptions at certain 
nodes. The difference between wireline and wireless packet networks, in our model, is that the 
reception of any particular packet may only occur at a single node in wireline packet networks 
while, in wireless packet networks, it may occur at more than one node. 

The model, which we now formally describe, is one that we believe is an accurate abstraction of 
packet networks as they are viewed at the level of packets, given a schedule of packet injections. 
In particular, our model captures various phenomena that complicate the efficient operation of 
wireless packet networks, including interference (insofar as it is manifested as lost packets, i.e. 
as collisions), fading (again, insofar as it is manifested as lost packets), and the broadcast nature 
of the medium. 

We begin with wireline packet networks. We model a wireline packet network (or, rather, the 
portion of it devoted to the connection we wish to establish) as a directed graph $\mathcal{G} = (\mathcal{N}, \mathcal{A})$,
where $\mathcal{N}$ is the set of nodes and $\mathcal{A}$ is the set of arcs. Each arc $(i, j)$ represents a lossy
point-to-point link. Some subset of the packets injected into arc $(i, j)$ by node $i$ are lost; the rest
are received by node $j$ without error. We denote by $z_{ij}$ the average rate at which packets are
received on arc $(i, j)$. More precisely, suppose that the arrival of received packets on arc $(i, j)$
is described by the counting process $A_{ij}$, i.e. for $\tau \ge 0$, $A_{ij}(\tau)$ is the total number of packets
received between time 0 and time $\tau$ on arc $(i, j)$. Then, by assumption,
$\lim_{\tau \to \infty} A_{ij}(\tau) / \tau = z_{ij}$ a.s. We define a lossy wireline packet network as a pair $(\mathcal{G}, z)$.

We assume that links are delay-free in the sense that the arrival time of a received packet 
corresponds to the time that it was injected into the link. Links with delay can be transformed 
into delay-free links in the following way: Suppose that arc (i, j) represents a link with delay. 
The counting process $A_{ij}$ describes the arrival of received packets on arc $(i, j)$, and we use the
counting process $A'_{ij}$ to describe the injection of these packets. (Hence $A'_{ij}$ counts a subset of the
packets injected into arc $(i, j)$.) We insert a node $i'$ into the network and transform arc $(i, j)$ into
two arcs $(i, i')$ and $(i', j)$. These two arcs, $(i, i')$ and $(i', j)$, represent delay-free links where the
arrivals of received packets are described by $A'_{ij}$ and $A_{ij}$, respectively. We place the losses on arc
$(i, j)$ onto arc $(i, i')$, so arc $(i', j)$ is lossless and node $i'$ simply functions as a first-in first-out
queue. It is clear that functioning as a first-in first-out queue is an optimal coding strategy for
$i'$ in terms of rate and complexity; hence, treating $i'$ as a node implementing the coding scheme
of Section II only deteriorates performance and is adequate for deriving achievable connection
rates. Thus, we can transform a link with delay and average packet reception rate $z_{ij}$ into two
delay-free links in tandem with the same average packet reception rate, and it will be evident 
that this transformation does not change any of our conclusions. 

For wireless packet networks, we model the network as a directed hypergraph $\mathcal{H} = (\mathcal{N}, \mathcal{A})$,
where $\mathcal{N}$ is the set of nodes and $\mathcal{A}$ is the set of hyperarcs. A hypergraph is a generalization of
a graph where generalized arcs, called hyperarcs, connect two or more nodes. Thus, a hyperarc
is a pair $(i, J)$, where $i$, the head, is an element of $\mathcal{N}$, and $J$, the tail, is a non-empty subset
of $\mathcal{N}$. Each hyperarc $(i, J)$ represents a lossy broadcast link. For each $K \subset J$, some disjoint
subset of the packets injected into hyperarc $(i, J)$ by node $i$ are received by exactly the set of
nodes $K$ without error.

We denote by $z_{iJK}$ the average rate at which packets, injected on hyperarc $(i, J)$, are received
by exactly the set of nodes $K \subset J$. More precisely, suppose that the arrival of packets that are
injected on hyperarc $(i, J)$ and received by all nodes in $K$ (and no nodes in $\mathcal{N} \setminus K$) is described
by the counting process $A_{iJK}$. Then, by assumption, $\lim_{\tau \to \infty} A_{iJK}(\tau) / \tau = z_{iJK}$ a.s. We define
a lossy wireless packet network as a pair $(\mathcal{H}, z)$.

A. Examples 

1) Network of independent transmission lines with non-bursty losses: We begin with a simple 
example. We consider a wireline network where each transmission line experiences losses 
independently of all other transmission lines, and the loss process on each line is non-bursty, 
i.e. it is accurately described by an i.i.d. process. 

Consider the link corresponding to arc $(i, j)$. Suppose the loss rate on this link is $\varepsilon_{ij}$, i.e. packets
are lost independently with probability $\varepsilon_{ij}$. Suppose further that the injection of packets on arc
$(i, j)$ is described by the counting process $B_{ij}$ and has average rate $r_{ij}$, i.e. $\lim_{\tau \to \infty} B_{ij}(\tau) / \tau = r_{ij}$
a.s. The parameters $r_{ij}$ and $\varepsilon_{ij}$ are not necessarily independent and may well be functions of
each other.

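For the arrival of received packets, we have

$$A_{ij}(\tau) = \sum_{k=1}^{B_{ij}(\tau)} X_k,$$

where $\{X_k\}$ is a sequence of i.i.d. Bernoulli random variables with $\Pr(X_k = 0) = \varepsilon_{ij}$. Therefore

$$\lim_{\tau \to \infty} \frac{A_{ij}(\tau)}{\tau} = \lim_{\tau \to \infty} \frac{\sum_{k=1}^{B_{ij}(\tau)} X_k}{\tau} = \lim_{\tau \to \infty} \frac{\sum_{k=1}^{B_{ij}(\tau)} X_k}{B_{ij}(\tau)} \cdot \frac{B_{ij}(\tau)}{\tau} = (1 - \varepsilon_{ij}) r_{ij},$$

which implies that

$$z_{ij} = (1 - \varepsilon_{ij}) r_{ij}.$$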

In particular, if the injection processes for all links are identical, regular, deterministic processes
with unit average rate (i.e. $B_{ij}(\tau) = 1 + \lfloor \tau \rfloor$ for all $(i, j)$), then we recover the model frequently
used in information-theoretic analyses (for example, in [7], [20]).

A particularly simple case arises when the injection processes are Poisson. In this case, $A_{ij}(\tau)$
and $B_{ij}(\tau)$ are Poisson random variables with parameters $(1 - \varepsilon_{ij}) r_{ij} \tau$ and $r_{ij} \tau$, respectively.
We shall revisit this case in Section V.
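
The relation $z_{ij} = (1 - \varepsilon_{ij}) r_{ij}$ for this example can be checked numerically. The sketch below simulates Poisson injections on a single arc with i.i.d. losses and reports the empirical rate $A_{ij}(\tau)/\tau$; the function name `received_rate` and the parameter values ($r_{ij} = 5$, $\varepsilon_{ij} = 0.2$, horizon 2000) are illustrative assumptions, not values from the paper.

```python
import random


def received_rate(r, eps, horizon, seed=0):
    """Empirical A(tau)/tau for Poisson injections of rate r with i.i.d. loss probability eps."""
    rng = random.Random(seed)
    t, received = 0.0, 0
    while True:
        t += rng.expovariate(r)  # exponential gaps give a Poisson injection process
        if t > horizon:
            break
        if rng.random() >= eps:  # X_k = 1: the packet survives with probability 1 - eps
            received += 1
    return received / horizon


rate = received_rate(r=5.0, eps=0.2, horizon=2000.0)
predicted = (1 - 0.2) * 5.0  # z = (1 - eps) * r
```

For a long horizon the empirical rate should lie close to the predicted value, in line with the almost-sure limit assumed in the model.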

2) Network of transmission lines with bursty losses: We now consider a more complicated 
example, which attempts to model bursty losses. Bursty losses arise frequently in packet networks 
because losses often result from phenomena that are time-correlated, for example, fading and 
buffer overflows. (We mention fading because a point-to-point wireless link is, for our purposes, 
essentially equivalent to a transmission line.) In the latter case, losses are also correlated across 
separate links — all links coming into a node experiencing a buffer overflow will be subjected to 
losses. 

To account for such correlations, Markov chains are often used. Fading channels, for example, 
are often modeled as finite-state Markov channels [28], [29], such as the Gilbert-Elliott channel
[30]. In these models, a Markov chain is used to model the time evolution of the channel state, 
which governs its quality. Thus, if the channel is in a bad state for some time, a burst of errors 
or losses is likely to result. 

We therefore associate with arc $(i, j)$ a continuous-time, irreducible Markov chain whose state
at time $\tau$ is $E_{ij}(\tau)$. If $E_{ij}(\tau) = k$, then the probability that a packet injected into $(i, j)$ at time
$\tau$ is lost is $\varepsilon_{ij}^{(k)}$. Suppose that the steady-state probabilities of the chain are $\{\pi_{ij}^{(k)}\}_k$. Suppose
further that the injection of packets on arc $(i, j)$ is described by the counting process $B_{ij}$ and
that, conditioned on $E_{ij}(\tau) = k$, this injection has average rate $r_{ij}^{(k)}$. Then, we obtain

$$z_{ij} = \pi_{ij}' y_{ij},$$

where $\pi_{ij}$ and $y_{ij}$ denote the column vectors with components $\{\pi_{ij}^{(k)}\}_k$ and $\{(1 - \varepsilon_{ij}^{(k)}) r_{ij}^{(k)}\}_k$,
respectively. Our conclusions are not changed if the evolutions of the Markov chains associated
with separate arcs are correlated, such as would arise from bursty losses resulting from buffer 
overflows. 

If the injection processes are Poisson, then arrivals of received packets are described by 
Markov-modulated Poisson processes (see, for example, [31]). 
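
To make the bursty-loss rate $z_{ij} = \pi_{ij}' y_{ij}$ concrete, the following sketch evaluates it for a two-state chain of the Gilbert-Elliott type (a "good" state 0 and a "bad" state 1). The restriction to two states, the function name `two_state_reception_rate`, and all numerical values are assumptions made for illustration only.

```python
def two_state_reception_rate(a, b, eps, r):
    """z = pi' y for a two-state chain with transition rates a (0 -> 1) and b (1 -> 0)."""
    # Stationary distribution of a two-state continuous-time chain: pi_0 * a = pi_1 * b.
    pi = (b / (a + b), a / (a + b))
    # y_k = (1 - eps_k) * r_k is the reception rate while the chain is in state k.
    y = [(1 - e) * rate for e, rate in zip(eps, r)]
    return sum(p * v for p, v in zip(pi, y))


# Good state: 1% loss; bad state: 50% loss; injection rate 10 in both states.
z = two_state_reception_rate(a=1.0, b=4.0, eps=(0.01, 0.5), r=(10.0, 10.0))
```

Here the chain spends a fraction 0.8 of the time in the good state, so $z = 0.8 \cdot 9.9 + 0.2 \cdot 5.0 = 8.92$ packets per unit time.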

3) Slotted Aloha wireless network: We now move from wireline packet networks to wireless 
packet networks or, more precisely, from networks of point-to-point links (transmission lines) to 
networks where links may be broadcast links. 

In wireless packet networks, one of the most important issues is medium access, i.e. determining
how radio nodes share the wireless medium. One simple, yet popular, method for medium access 
control is slotted Aloha (see, for example, [32, Section 4.2]), where nodes with packets to send 
follow simple random rules to determine when they transmit. In this example, we consider a 
wireless packet network using slotted Aloha for medium access control. The example illustrates 
how a high degree of correlation in the loss processes on separate links sometimes exists. 

For the coding scheme we consider, nodes transmit whenever they are given the opportunity 
and thus effectively always have packets to send. So suppose that, in any given time slot, node
$i$ transmits a packet on hyperarc $(i, J)$ with probability $q_{iJ}$. Let $p'_{iJK|C}$ be the probability that
a packet transmitted on hyperarc $(i, J)$ is received by exactly $K \subset J$ given that packets are
transmitted on hyperarcs $C \subset \mathcal{A}$ in the same slot. The distribution of $p'_{iJK|C}$ depends on many
factors: In the simplest case, if two nodes close to each other transmit in the same time slot, then 
their transmissions interfere destructively, resulting in a collision where neither node's packet is 
received. It is also possible that simultaneous transmission does not necessarily result in collision, 
and one or more packets are received — sometimes referred to as multipacket reception capability 
[33]. It may even be the case that physical-layer cooperative schemes, such as those presented 
in [34], [35], [36], are used, where nodes that are not transmitting packets are used to assist
those that are.

Fig. 2. The slotted Aloha relay channel. We wish to establish a unicast connection from node 1 to node 3.

Let $p_{iJK}$ be the unconditioned probability that a packet transmitted on hyperarc $(i, J)$ is
received by exactly $K \subset J$. So

$$p_{iJK} = \sum_{C \subset \mathcal{A}} p'_{iJK|C} \left( \prod_{(j, L) \in C} q_{jL} \right) \left( \prod_{(j, L) \in \mathcal{A} \setminus C} (1 - q_{jL}) \right).$$

Hence, assuming that time slots are of unit length, we see that $A_{iJK}(\tau)$ follows a binomial
distribution and

$$z_{iJK} = q_{iJ} p_{iJK}.$$

A particular network topology of interest is shown in Figure 2. The problem of setting up a
unicast connection from node 1 to node 3 in a slotted Aloha wireless network of this topology 
is a problem that we refer to as the slotted Aloha relay channel, in analogy to the symbol-level 
relay channel widely-studied in network information theory. The latter problem is a well-known 
open problem, while the former is, as we shall see, tractable and deals with the same issues of 
broadcast and multiple access, albeit under different assumptions. 

A case similar to that of slotted Aloha wireless networks is that of untuned radio networks, 
which are detailed in [37]. In such networks, nodes are designed to be low-cost and low-power 
by sacrificing the ability for accurate tuning of their carrier frequencies. Thus, nodes transmit 
on random frequencies, which leads to random medium access and contention. 

IV. Coding theorems 

In this section, we specify achievable rate regions for the coding scheme in various scenarios. 
The fact that the regions we specify are the largest possible (i.e. that the scheme is capacity- 
achieving) can be seen by simply noting that the rate between any source and any sink must be 
limited by the rate at which distinct packets are received over any cut between that source and
that sink. A formal converse can be obtained using the cut-set bound for multi-terminal networks 
(see [1, Section 14.10]). 

A. Wireline networks 

1) Unicast connections: We develop our general result for unicast connections by extending 
from some special cases. We begin with the simplest non-trivial case: that of two links in tandem 
(see Figure 3).

Suppose we wish to establish a connection of rate arbitrarily close to R packets per unit time 
from node 1 to node 3. Suppose further that the coding scheme is run for a total time $\Delta$, from
time 0 until time $\Delta$, and that, in this time, a total of $N$ packets is received by node 2. We call
these packets $v_1, v_2, \ldots, v_N$.

Any received packet $x$ in the network is a linear combination of $v_1, v_2, \ldots, v_N$, so we can
write

$$x = \sum_{n=1}^{N} \beta_n v_n.$$

Since $v_n$ is formed by a random linear combination of the message packets $w_1, w_2, \ldots, w_K$, we
have

$$v_n = \sum_{k=1}^{K} \alpha_{nk} w_k$$

for $n = 1, 2, \ldots, N$, where each $\alpha_{nk}$ is drawn from a uniform distribution over $\mathbb{F}_q$. Hence

$$x = \sum_{k=1}^{K} \left( \sum_{n=1}^{N} \beta_n \alpha_{nk} \right) w_k,$$

and it follows that the $k$th component of the global encoding vector of $x$ is given by

$$\gamma_k = \sum_{n=1}^{N} \beta_n \alpha_{nk}.$$
We call the vector $\beta$ associated with $x$ the auxiliary encoding vector of $x$, and we see that any
node that receives $\lfloor K(1 + \varepsilon) \rfloor$ or more packets with linearly-independent auxiliary encoding
vectors has $\lfloor K(1 + \varepsilon) \rfloor$ packets whose global encoding vectors collectively form a random
$\lfloor K(1 + \varepsilon) \rfloor \times K$ matrix over $\mathbb{F}_q$, with all entries chosen uniformly. If this matrix has rank $K$,
then node 3 is able to recover the message packets. The probability that a random $\lfloor K(1 + \varepsilon) \rfloor \times K$
matrix has rank $K$ is, by a simple counting argument,
$\prod_{k = 1 + \lfloor K(1 + \varepsilon) \rfloor - K}^{\lfloor K(1 + \varepsilon) \rfloor} (1 - 1/q^k)$, which can be
made arbitrarily close to 1 by taking $K$ arbitrarily large. Therefore, to determine whether node
3 can recover the message packets, we essentially need only to determine whether it receives
$\lfloor K(1 + \varepsilon) \rfloor$ or more packets with linearly-independent auxiliary encoding vectors.
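
The full-rank probability $\prod_{k = 1 + \lfloor K(1 + \varepsilon) \rfloor - K}^{\lfloor K(1 + \varepsilon) \rfloor} (1 - 1/q^k)$ is easy to evaluate numerically, which gives a feel for how quickly it approaches 1 as $K$ grows. The sketch below does this; the function name `full_rank_probability` and the chosen values of $K$, $\varepsilon$, and $q$ are illustrative assumptions.

```python
import math


def full_rank_probability(K, eps, q):
    """Probability that a uniformly random floor(K(1+eps)) x K matrix over F_q has rank K."""
    m = math.floor(K * (1 + eps))  # number of rows
    prob = 1.0
    for k in range(1 + m - K, m + 1):
        prob *= 1.0 - q ** (-k)
    return prob


square_binary = full_rank_probability(2, 0.0, 2)    # 2 x 2 matrices over F_2
small_block = full_rank_probability(10, 0.1, 2)     # 11 x 10 over F_2
large_block = full_rank_probability(100, 0.1, 2)    # 110 x 100 over F_2
```

As a sanity check, 6 of the 16 binary $2 \times 2$ matrices are invertible, which matches the value $0.375$ that the product gives for $K = 2$, $\varepsilon = 0$, $q = 2$.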

Our proof is based on tracking the propagation of what we call innovative packets. Such
packets are innovative in the sense that they carry new, as yet unknown, information about
$v_1, v_2, \ldots, v_N$ to a node.¹ It turns out that the propagation of innovative packets through a network
follows the propagation of jobs through a queueing network, for which fluid flow models give
good approximations. We present the following argument in terms of this fluid analogy and defer
the formal argument to Appendix I-A.

Since the packets being received by node 2 are the packets $v_1, v_2, \ldots, v_N$ themselves, it is clear
that every packet being received by node 2 is innovative. Thus, innovative packets arrive at node
2 at a rate of $z_{12}$, and this can be approximated by fluid flowing in at rate $z_{12}$. These innovative
packets are stored in node 2's memory, so the fluid that flows in is stored in a reservoir.

Packets, now, are being received by node 3 at a rate of $z_{23}$, but whether these packets are
innovative depends on the contents of node 2's memory. If node 2 has more information about
$v_1, v_2, \ldots, v_N$ than node 3 does, then it is highly likely that new information will be described
to node 3 in the next packet that it receives. Otherwise, if node 2 and node 3 have the same
degree of information about $v_1, v_2, \ldots, v_N$, then packets received by node 3 cannot possibly be
innovative. Thus, the situation is as though fluid flows into node 3's reservoir at a rate of $z_{23}$, but
the level of node 3's reservoir is restricted from ever exceeding that of node 2's reservoir. The
level of node 3's reservoir, which is ultimately what we are concerned with, can equivalently be
determined by fluid flowing out of node 2's reservoir at rate $z_{23}$.

¹Note that, although we are ultimately concerned with recovering $w_1, w_2, \ldots, w_K$ rather than $v_1, v_2, \ldots, v_N$, we define
packets to be innovative with respect to $v_1, v_2, \ldots, v_N$. This serves to simplify our proof. In particular, it means that we do not
need to be very strict in our tracking of the propagation of innovative packets since the number of innovative packets required at
the sink is only a fraction of $N$.

Fig. 4. Fluid flow system corresponding to two-link tandem network.

Fig. 5. A network consisting of L links in tandem.

We therefore see that the two-link tandem network in Figure 3 maps to the fluid flow system
shown in Figure 4. It is clear that, in this system, fluid flows into node 3's reservoir at rate
$\min(z_{12}, z_{23})$. This rate determines the rate at which innovative packets (packets with new
information about $v_1, v_2, \ldots, v_N$ and, therefore, with linearly-independent auxiliary encoding
vectors) arrive at node 3. Hence the time required for node 3 to receive $\lfloor K(1 + \varepsilon) \rfloor$ packets
with linearly-independent auxiliary encoding vectors is, for large $K$, approximately
$K(1 + \varepsilon) / \min(z_{12}, z_{23})$, which implies that a connection of rate arbitrarily close to $R$ packets per
unit time can be established provided that

$$R \le \min(z_{12}, z_{23}). \qquad (1)$$

Thus, we see that the rate at which innovative packets are received by the sink corresponds to an
achievable rate. Moreover, the right-hand side of (1) is indeed the capacity of the two-link tandem
network, and we therefore have the desired result for this case.

We extend our result to another special case before considering general unicast connections: 
We consider the case of a tandem network consisting of $L$ links and $L + 1$ nodes (see Figure 5).

This case is a straightforward extension of that of the two-link tandem network. It maps to
the fluid flow system shown in Figure 6. In this system, it is clear that fluid flows into node
$(L + 1)$'s reservoir at rate $\min_{1 \le i \le L} \{z_{i(i+1)}\}$. Hence a connection of rate arbitrarily close to $R$
packets per unit time from node 1 to node $L + 1$ can be established provided that

$$R \le \min_{1 \le i \le L} \{z_{i(i+1)}\}. \qquad (2)$$

Since the right-hand side of (2) is indeed the capacity of the $L$-link tandem network, we therefore
have the desired result for this case. A formal argument is in Appendix I-B.

Fig. 6. Fluid flow system corresponding to L-link tandem network.
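
The fluid picture for tandem networks can be imitated by a small queueing simulation, sketched below under several simplifying assumptions that are not part of the paper's argument: time is divided into unit slots, link $i$ delivers a packet in each slot independently with probability $z_{i(i+1)}$, and a delivered packet is counted as innovative whenever the sender holds more innovative packets than the receiver (the text only says this is highly likely). The function name `innovative_rate` and the link rates are hypothetical.

```python
import math
import random


def innovative_rate(z, slots, seed=0):
    """Innovative packets per slot reaching the last node of a tandem network.

    z[i] is the probability that link i delivers a packet in a given unit slot.
    """
    rng = random.Random(seed)
    # levels[i] is the reservoir level of node i + 1; the source never runs dry.
    levels = [math.inf] + [0] * len(z)
    for _ in range(slots):
        # Serve the last link first so a packet crosses at most one link per slot.
        for i in reversed(range(len(z))):
            delivered = rng.random() < z[i]
            # A reservoir may never rise above the level of the one feeding it.
            if delivered and levels[i] > levels[i + 1]:
                levels[i + 1] += 1
    return levels[-1] / slots


two_link = innovative_rate([0.5, 0.3], slots=100_000)
three_link = innovative_rate([0.4, 0.6, 0.5], slots=100_000)
```

In both runs the measured rate should settle near the smallest link rate (0.3 and 0.4, respectively), consistent with the minimum over links that appears in the tandem-network rate conditions.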

We now extend our result to general unicast connections. The strategy here is simple: A 
general unicast connection can be formulated as a flow, which can be decomposed into a finite 
number of paths. Each of these paths is a tandem network, which is the case that we have just 
considered. 

Suppose that we wish to establish a connection of rate arbitrarily close to R packets per unit 
time from source node s to sink node t. Suppose further that 

R < min < > 

Q€Q{s,t) ^ 

|^(i,j)er+(Q) 

where t) is the set of all cuts between s and t, and r+((5) denotes the set of forward arcs 
of the cut Q, i.e. 

r+(g) ■.= {{i.3)^A\i^Q,3iQ}- 

Therefore, by the max-flow/min-cut theorem (see, for example, [38, Section 3.1]), there exists
a flow vector f satisfying

$$\sum_{\{j \mid (i,j) \in \mathcal{A}\}} f_{ij} - \sum_{\{j \mid (j,i) \in \mathcal{A}\}} f_{ji} = \begin{cases} R & \text{if } i = s, \\ -R & \text{if } i = t, \\ 0 & \text{otherwise,} \end{cases}$$

for all $i \in \mathcal{N}$, and

$$0 \leq f_{ij} \leq z_{ij}$$

for all $(i,j) \in \mathcal{A}$. We assume, without loss of generality, that f is cycle-free in the sense that
the subgraph $\mathcal{G}' = (\mathcal{N}, \mathcal{A}')$, where $\mathcal{A}' := \{(i,j) \in \mathcal{A} \mid f_{ij} > 0\}$, is acyclic. (If $\mathcal{G}'$ has a cycle, then
it can be eliminated by subtracting flow from f around it.)

Using the conformal realization theorem (see, for example, [38, Section 1.1]), we decompose f
into a finite set of paths $\{p_1, p_2, \ldots, p_M\}$, each carrying positive flow $R_m$ for m = 1, 2, ..., M,
such that $\sum_{m=1}^{M} R_m = R$. We treat each path $p_m$ as a tandem network and use it to deliver
innovative packets at rate arbitrarily close to $R_m$, resulting in an overall rate for innovative
packets arriving at node t that is arbitrarily close to R. A formal argument is in Appendix I-C.
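
The path decomposition can be made concrete with a small sketch. The code below greedily peels paths off a cycle-free flow; it is one simple way to obtain such a decomposition, not necessarily the construction used in [38]. The function name, the dictionary representation of the flow, and the example network are assumptions made for illustration.

```python
def decompose_flow(flow, source, sink, tol=1e-12):
    """Greedily decompose a cycle-free s-t flow into paths.

    flow maps arcs (i, j) to nonnegative values f_ij that satisfy flow
    conservation. Returns a list of (path, rate) pairs whose rates sum
    to the size of the flow.
    """
    residual = {arc: value for arc, value in flow.items() if value > tol}
    paths = []
    while any(arc[0] == source for arc in residual):
        # Follow arcs that still carry flow until the sink is reached.
        path = [source]
        while path[-1] != sink:
            successors = [j for (i, j) in residual if i == path[-1]]
            if not successors or successors[0] in path:
                raise ValueError("flow is not a cycle-free s-t flow")
            path.append(successors[0])
        arcs = list(zip(path[:-1], path[1:]))
        rate = min(residual[arc] for arc in arcs)  # this path carries R_m
        for arc in arcs:
            residual[arc] -= rate
            if residual[arc] <= tol:
                del residual[arc]
        paths.append((path, rate))
    return paths


# Hypothetical flow of size R = 3 from "s" to "t".
example_flow = {
    ("s", "a"): 2.0, ("s", "b"): 1.0,
    ("a", "t"): 1.0, ("a", "b"): 1.0, ("b", "t"): 2.0,
}
for path, rate in decompose_flow(example_flow, "s", "t"):
    print(path, rate)
```

Each iteration removes at least one arc from the residual flow, so the loop terminates after at most as many paths as there are arcs.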

2) Multicast connections: The result for multicast connections is, in fact, a straightforward
extension of that for unicast connections. In this case, rather than a single sink t, we have a set
of sinks T. As in the framework of static broadcasting (see [39], [40]), we allow sink nodes to
operate at different rates. We suppose that sink $t \in T$ wishes to achieve rate arbitrarily close
to $R_t$, i.e., to recover the K message packets, sink t wishes to wait for a time that is only
marginally greater than $K/R_t$. We further suppose that

$$R_t \leq \min_{Q \in \mathcal{Q}(s,t)} \left\{ \sum_{(i,j) \in \Gamma_+(Q)} z_{ij} \right\}$$

for all $t \in T$. Therefore, by the max-flow/min-cut theorem, there exists, for each $t \in T$, a flow
vector $f^{(t)}$ satisfying

$$\sum_{\{j \mid (i,j) \in \mathcal{A}\}} f^{(t)}_{ij} - \sum_{\{j \mid (j,i) \in \mathcal{A}\}} f^{(t)}_{ji} = \begin{cases} R_t & \text{if } i = s, \\ -R_t & \text{if } i = t, \\ 0 & \text{otherwise,} \end{cases}$$

for all $i \in \mathcal{N}$, and $0 \leq f^{(t)}_{ij} \leq z_{ij}$ for all $(i,j) \in \mathcal{A}$.

For each flow vector $f^{(t)}$, we go through the same argument as that for a unicast connection,
and we find that the probability of error at every sink node can be made arbitrarily small by
taking K sufficiently large.

We summarize our results regarding wireline networks with the following theorem statement. 





Theorem 1: Consider the lossy wireline packet network $(\mathcal{G}, z)$. The random linear coding
scheme described in Section II is capacity-achieving for multicast connections, i.e., for K
sufficiently large, it can achieve, with arbitrarily small error probability, a multicast connection
from source node s to sink nodes in the set T at rate arbitrarily close to $R_t$ packets per unit
time for each $t \in T$ if

$$R_t \leq \min_{Q \in \mathcal{Q}(s,t)} \left\{ \sum_{(i,j) \in \Gamma_+(Q)} z_{ij} \right\}$$

for all $t \in T$.^1

Remark. The capacity region is determined solely by the average rate $z_{ij}$ at which packets are
received on each arc (i, j). Therefore, the packet injection and loss processes, which give rise
to the packet reception processes, can take any distribution, exhibiting arbitrary correlations, as
long as these average rates exist.
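
For small networks, the cut condition of Theorem 1 can be checked by direct enumeration. The sketch below assumes the condition has the same min-cut form as in the unicast case, applied to each sink; it enumerates every cut, which is exponential in the number of nodes and therefore only suitable for toy examples. The function names and the example rates are assumptions made for illustration.

```python
from itertools import combinations


def min_cut_value(nodes, rates, source, sink):
    """Minimum over cuts Q (source in Q, sink not in Q) of the sum of z_ij
    over forward arcs (i in Q, j not in Q), by brute-force enumeration."""
    others = [n for n in nodes if n not in (source, sink)]
    best = float("inf")
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            cut = {source, *subset}
            value = sum(z for (i, j), z in rates.items()
                        if i in cut and j not in cut)
            best = min(best, value)
    return best


def multicast_rates_supported(nodes, rates, source, sink_rates):
    """Check R_t <= min-cut(s, t) for every sink t in sink_rates."""
    return all(r <= min_cut_value(nodes, rates, source, t)
               for t, r in sink_rates.items())


# Hypothetical average reception rates z_ij (packets per unit time).
example_nodes = ["s", "a", "b", "t"]
example_rates = {
    ("s", "a"): 2.0, ("s", "b"): 1.0,
    ("a", "t"): 1.0, ("a", "b"): 1.0, ("b", "t"): 2.0,
}
print(min_cut_value(example_nodes, example_rates, "s", "t"))
print(multicast_rates_supported(example_nodes, example_rates, "s",
                                {"t": 3.0, "b": 2.0}))
```

Only the average rates enter the computation, in line with the remark above; for larger networks a max-flow algorithm would replace the enumeration.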

B. Wireless packet networks 

The wireless case is actually very similar to the wireline one. The main difference is that we 
now deal with hypergraph flows rather than regular graph flows. 

Suppose that we wish to establish a connection of rate arbitrarily close to R packets per unit 
time from source node s to sink node t. Suppose further that 



$$R \leq \min_{Q \in \mathcal{Q}(s,t)} \left\{ \sum_{(i,J) \in \Gamma_+(Q)} \sum_{K \not\subset Q} z_{iJK} \right\},$$

where $\mathcal{Q}(s,t)$ is the set of all cuts between s and t, and $\Gamma_+(Q)$ denotes the set of forward
hyperarcs of the cut Q, i.e.

$$\Gamma_+(Q) := \{(i,J) \in \mathcal{A} \mid i \in Q, J \setminus Q \neq \emptyset\}.$$

Therefore there exists a flow vector f satisfying

$$\sum_{\{J \mid (i,J) \in \mathcal{A}\}} \sum_{j \in J} f_{iJj} - \sum_{\{j \mid (j,I) \in \mathcal{A}, i \in I\}} f_{jIi} = \begin{cases} R & \text{if } i = s, \\ -R & \text{if } i = t, \\ 0 & \text{otherwise,} \end{cases}$$

for all $i \in \mathcal{N}$,

$$\sum_{j \in K} f_{iJj} \leq \sum_{\{L \subset J \mid L \cap K \neq \emptyset\}} z_{iJL} \tag{3}$$

for all $(i,J) \in \mathcal{A}$ and $K \subset J$, and $f_{iJj} \geq 0$ for all $(i,J) \in \mathcal{A}$ and $j \in J$. We again decompose f
into a finite set of paths $\{p_1, p_2, \ldots, p_M\}$, each carrying positive flow $R_m$ for m = 1, 2, ..., M,
such that $\sum_{m=1}^{M} R_m = R$. Some care must be taken in the interpretation of the flow and its
path decomposition because, in a wireless transmission, the same packet may be received by
more than one node. The details of the interpretation are in Appendix I-D and, with it, we can
use path $p_m$ to deliver innovative packets at rate arbitrarily close to $R_m$, yielding the following
theorem.

^1 In earlier versions of this work [41], [42], we required the field size q of the coding scheme to approach infinity for Theorem 1
to hold. This requirement is in fact not necessary, and the formal arguments in Appendix I do not require it.
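
The hypergraph cut value can be evaluated in the same brute-force way as in the wireline case. The sketch below assumes that the value of a cut Q is the sum of $z_{iJK}$ over hyperarcs (i, J) with i in Q and over receiver sets K that are not contained in Q, i.e. over receptions that reach at least one node outside Q. The data layout (a dictionary keyed by (i, J, K) with frozensets) and all names are assumptions made for illustration.

```python
from itertools import combinations


def hypergraph_cut_value(cut, reception_rates):
    """Sum of z_iJK over start nodes i in the cut and receiver sets K
    that contain at least one node outside the cut."""
    total = 0.0
    for (node, _end_nodes, receivers), rate in reception_rates.items():
        # K is a subset of J, so K not within the cut implies J \ Q is nonempty.
        if node in cut and not receivers <= cut:
            total += rate
    return total


def hypergraph_min_cut(nodes, reception_rates, source, sink):
    """Brute-force minimum of the cut value over all s-t cuts."""
    others = [n for n in nodes if n not in (source, sink)]
    best = float("inf")
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            cut = frozenset((source,) + subset)
            best = min(best, hypergraph_cut_value(cut, reception_rates))
    return best


# Hypothetical example: s broadcasts to {a, t}; a transmits to {t}.
J = frozenset({"a", "t"})
example_reception_rates = {
    ("s", J, frozenset({"a"})): 0.5,         # received by a only
    ("s", J, frozenset({"t"})): 0.25,        # received by t only
    ("s", J, frozenset({"a", "t"})): 0.125,  # received by both
    ("a", frozenset({"t"}), frozenset({"t"})): 0.75,
}
print(hypergraph_min_cut(["s", "a", "t"], example_reception_rates, "s", "t"))
```

A packet received by several nodes at once is counted once per cut, which is the point where the wireless case departs from simply summing arc rates.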

Theorem 2: Consider the lossy wireless packet network $(\mathcal{H}, z)$. The random linear coding
scheme described in Section II is capacity-achieving for multicast connections, i.e., for K
sufficiently large, it can achieve, with arbitrarily small error probability, a multicast connection
from source node s to sink nodes in the set T at rate arbitrarily close to $R_t$ packets per unit
time for each $t \in T$ if

$$R_t \leq \min_{Q \in \mathcal{Q}(s,t)} \left\{ \sum_{(i,J) \in \Gamma_+(Q)} \sum_{K \not\subset Q} z_{iJK} \right\}$$

for all $t \in T$.

V. Error exponents for Poisson traffic with i.i.d. losses

We now look at the rate of decay of the probability of error $p_e$ in the coding delay $\Delta$. In
contrast to traditional error exponents where coding delay is measured in symbols, we measure
coding delay in time units: time $\tau = \Delta$ is the time at which the sink nodes attempt to decode the
message packets. The two methods of measuring delay are essentially equivalent when packets
arrive in regular, deterministic intervals.

We specialize to the case of Poisson traffic with i.i.d. losses. Hence, in the wireline case, the
process $A_{ij}$ is a Poisson process with rate $z_{ij}$ and, in the wireless case, the process $A_{iJK}$ is
a Poisson process with rate $z_{iJK}$. Consider the unicast case for now, and suppose we wish to
establish a connection of rate R. Let C be the supremum of all asymptotically-achievable rates.

To derive exponentially-tight bounds on the probability of error, it is easiest to consider the
case where the links are in fact delay-free, and the transformation, described in Section III for
links with delay, has not been applied. The results we derive do, however, apply in the latter case.
We begin by deriving an upper bound on the probability of error. To this end, we take a flow
vector f from s to t of size C and, following the development in Appendix I, develop a queueing
network from it that describes the propagation of innovative packets for a given innovation order
$\rho$. This queueing network now becomes a Jackson network. Moreover, as a consequence of
Burke's theorem (see, for example, [43, Section 2.1]) and the fact that the queueing network is
acyclic, the arrival and departure processes at all stations are Poisson in steady-state.

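Let $\Psi_t(m)$ be the arrival time of the mth innovative packet at t, and let $C' := (1 - q^{-\rho})C$.
When the queueing network is in steady-state, the arrival of innovative packets at t is described
by a Poisson process of rate $C'$. Hence we have

$$\lim_{m \to \infty} \frac{1}{m} \log E[\exp(\theta \Psi_t(m))] = \log \frac{C'}{C' - \theta} \tag{4}$$

for $0 \leq \theta < C'$ [44], [45]. If an error occurs, then fewer than $\lceil R\Delta \rceil$ innovative packets are received
by t by time $\tau = \Delta$, which is equivalent to saying that $\Psi_t(\lceil R\Delta \rceil) > \Delta$. Therefore,

$$p_e \leq \Pr(\Psi_t(\lceil R\Delta \rceil) > \Delta),$$

and, using the Chernoff bound, we obtain

$$p_e \leq \min_{0 \leq \theta < C'} \exp\left(-\theta\Delta + \log E[\exp(\theta \Psi_t(\lceil R\Delta \rceil))]\right).$$

Let $\varepsilon$ be a positive real number. Then using equation (4) we obtain, for $\Delta$ sufficiently large,

$$p_e \leq \min_{0 \leq \theta < C'} \exp\left(-\theta\Delta + R\Delta\left\{\log\frac{C'}{C' - \theta} + \varepsilon\right\}\right) = \exp\left(-\Delta(C' - R - R\log(C'/R)) + R\Delta\varepsilon\right).$$

Hence, we conclude that

$$\lim_{\Delta \to \infty} \frac{-\log p_e}{\Delta} \geq C' - R - R\log(C'/R). \tag{5}$$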

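For the lower bound, we examine a cut whose flow capacity is C. We take one such cut and
denote it by $Q^*$. It is clear that, if fewer than $\lceil R\Delta \rceil$ distinct packets are received across $Q^*$ in
time $\tau = \Delta$, then an error occurs. For both wireline and wireless networks, the arrival of distinct
packets across $Q^*$ is described by a Poisson process of rate C. Thus we have

$$p_e \geq \exp(-C\Delta) \sum_{l=0}^{\lceil R\Delta \rceil - 1} \frac{(C\Delta)^l}{l!} \geq \exp(-C\Delta) \frac{(C\Delta)^{\lceil R\Delta \rceil - 1}}{\Gamma(\lceil R\Delta \rceil)},$$

and, using Stirling's formula, we obtain

$$\lim_{\Delta \to \infty} \frac{-\log p_e}{\Delta} \leq C - R - R\log(C/R). \tag{6}$$

Since (5) holds for all positive integers $\rho$, and $C'$ approaches C as $\rho$ grows, we conclude from (5) and (6) that

$$\lim_{\Delta \to \infty} \frac{-\log p_e}{\Delta} = C - R - R\log(C/R). \tag{7}$$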
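
To get a feel for the exponent $C - R - R\log(C/R)$, the following sketch evaluates it and compares it with the decay rate of the Poisson tail that appears in the lower bound, at a finite delay. The function names and the numerical values are assumptions chosen for illustration; the finite-delay figure differs from the exponent by terms that vanish as the delay grows.

```python
import math


def error_exponent(rate, capacity):
    """Exponent C - R - R log(C / R), defined here for 0 < R <= C."""
    if not 0 < rate <= capacity:
        raise ValueError("require 0 < rate <= capacity")
    return capacity - rate - rate * math.log(capacity / rate)


def poisson_lower_bound(rate, capacity, delay):
    """exp(-C * delay) * sum_{l=0}^{ceil(R * delay) - 1} (C * delay)^l / l!,
    i.e. the probability that fewer than ceil(R * delay) packets cross the cut."""
    count = math.ceil(rate * delay)
    mean = capacity * delay
    term = math.exp(-mean)  # l = 0 term; underflows for very large mean
    total = 0.0
    for l in range(count):
        total += term
        term *= mean / (l + 1)
    return total


R, C, delay = 1.0, 2.0, 100.0  # hypothetical rate, min-cut capacity, delay
bound = poisson_lower_bound(R, C, delay)
print(error_exponent(R, C))
print(-math.log(bound) / delay)
```

The exponent is zero at R = C and grows as R falls below C, so backing off from capacity buys a faster decay of the error probability.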



Equation (7) defines the asymptotic rate of decay of the probability of error in the coding delay
$\Delta$. This asymptotic rate of decay is determined entirely by R and C. Thus, for a packet network
with Poisson traffic and i.i.d. losses employing the coding scheme described in Section II, the
flow capacity C of the minimum cut of the network is essentially the sole figure of merit of
importance in determining the effectiveness of the coding scheme for large, but finite, coding
delay. Hence, in deciding how to inject packets to support the desired connection, a sensible
approach is to reduce our attention to this figure of merit, which is indeed the approach taken
in [21].

Extending the result from unicast connections to multicast connections is straightforward: we
simply obtain (7) for each sink.

VI. Conclusion

We have proposed a simple random linear coding scheme for reliable communication over
packet networks and demonstrated that it is capacity-achieving as long as packets received on
a link arrive according to a process that has an average rate. In the special case of Poisson
traffic with i.i.d. losses, we have given error exponents that quantify the rate of decay of the
probability of error with coding delay. Our analysis took into account various peculiarities of
packet-level coding that distinguish it from symbol-level coding. Thus, our work intersects both
with information theory and networking theory and, as such, draws upon results from the two
usually-disparate fields [46]. Whether our results have implications for particular problems in
either field remains to be explored.

Though we believe that the scheme may be practicable, we also believe that, through a greater
degree of design or use of feedback, the scheme can be improved. Indeed, feedback can be readily
employed to reduce the memory requirements of intermediate nodes by getting them to clear
their memories of information already known to their downstream neighbors. Aside from the
scheme's memory requirements, we may wish to improve its coding and decoding complexity
and its side information overhead. We may also wish to improve its delay, a very important
performance factor that we have not explicitly considered, largely owing to the difficulty of doing
so. The margin for improvement is elucidated in part in [12], which analyses various packet-level
coding schemes, including ARQ and the scheme of this paper, and assesses their delay,
throughput, memory usage, and computational complexity for the two-link tandem network of
Figure 3. In our search for such improved schemes, we may be aided by the existing schemes
that we have mentioned that apply to specific, special networks.

We should not, however, focus our attention solely on the packet-level code. The packet-level 
code and the symbol-level code collectively form a type of concatenated code, and an endeavor to 
understand the interaction of these two coding layers is worthwhile. Some work in this direction 
can be found in [47].

Appendix I
Formal arguments for main result

Here, we give formal arguments for Theorems 1 and 2. Appendices I-A, I-B, and I-C give
formal arguments for three special cases of Theorem 1: the two-link tandem network, the L-link
tandem network, and general unicast connections, respectively. Appendix I-D gives a formal
argument for Theorem 2 in the case of general unicast connections.

A. Two-link tandem network 

We consider all packets received by node 2, namely $v_1, v_2, \ldots, v_N$, to be innovative. We
associate with node 2 the set of vectors U, which varies with time and is initially empty, i.e.
$U(0) := \emptyset$. If packet x is received by node 2 at time $\tau$, then its auxiliary encoding vector $\beta$ is
added to U at time $\tau$, i.e. $U(\tau^+) := \{\beta\} \cup U(\tau)$.

We associate with node 3 the set of vectors W, which again varies with time and is initially
empty. Suppose that packet x, with auxiliary encoding vector $\beta$, is received by node 3 at time
$\tau$. Let $\rho$ be a positive integer, which we call the innovation order. Then we say x is innovative
if $\beta \notin \mathrm{span}(W(\tau))$ and $|U(\tau)| > |W(\tau)| + \rho - 1$. If x is innovative, then $\beta$ is added to W at
time $\tau$.

The definition of innovative is designed to satisfy two properties: First, we require that $W(\Delta)$,
the set of vectors in W when the scheme terminates, is linearly independent. Second, we require
that, when a packet is received by node 3 and $|U(\tau)| > |W(\tau)| + \rho - 1$, it is innovative with
high probability. The innovation order $\rho$ is an arbitrary factor that ensures that the latter property
is satisfied.

Suppose that packet x, with auxiliary encoding vector $\beta$, is received by node 3 at time $\tau$
and that $|U(\tau)| > |W(\tau)| + \rho - 1$. Since $\beta$ is a random linear combination of vectors in $U(\tau)$,
it follows that x is innovative with some non-trivial probability. More precisely, because $\beta$
is uniformly-distributed over $q^{|U(\tau)|}$ possibilities, of which at least $q^{|U(\tau)|} - q^{|W(\tau)|}$ are not in
$\mathrm{span}(W(\tau))$, it follows that

$$\Pr(\beta \notin \mathrm{span}(W(\tau))) \geq \frac{q^{|U(\tau)|} - q^{|W(\tau)|}}{q^{|U(\tau)|}} = 1 - q^{|W(\tau)| - |U(\tau)|} \geq 1 - q^{-\rho}.$$

Hence x is innovative with probability at least $1 - q^{-\rho}$. Since we can always discard innovative
packets, we assume that the event occurs with probability exactly $1 - q^{-\rho}$. If instead $|U(\tau)| \leq
|W(\tau)| + \rho - 1$, then we see that x cannot be innovative, and this remains true at least until another
arrival occurs at node 2. Therefore, for an innovation order of $\rho$, the propagation of innovative
packets through node 2 is described by the propagation of jobs through a single-server queueing
station with queue size $(|U(\tau)| - |W(\tau)| - \rho + 1)^+$.
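
The bound $1 - q^{-\rho}$ can be checked exhaustively for a tiny instance. The sketch below works over GF(2) (so q = 2) and represents vectors as integer bitmasks, with XOR as vector addition; both choices, the function names, and the example sets are assumptions made for illustration.

```python
from itertools import product


def span_gf2(vectors):
    """All GF(2) linear combinations of the given bitmask vectors."""
    span = {0}
    for v in vectors:
        span |= {s ^ v for s in span}
    return span


def innovation_probability(U, W):
    """Exact Pr(beta not in span(W)) when beta is a uniformly random
    GF(2) linear combination of the vectors in U."""
    span_W = span_gf2(W)
    outcomes = 0
    innovative = 0
    for coeffs in product((0, 1), repeat=len(U)):
        beta = 0
        for c, u in zip(coeffs, U):
            if c:
                beta ^= u
        outcomes += 1
        innovative += beta not in span_W
    return innovative / outcomes


# Hypothetical state: |U| = 4, |W| = 2, innovation order rho = 2,
# so |U| > |W| + rho - 1 holds.
U = [0b0001, 0b0010, 0b0100, 0b1000]
W = [0b0011, 0b0101]
rho = 2
print(innovation_probability(U, W), 1 - 2 ** (-rho))
```

In this instance the bound is met with equality, because span(W) contains exactly $2^{|W|}$ of the $2^{|U|}$ equally likely outcomes.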

The queueing station is serviced with probability $1 - q^{-\rho}$ whenever the queue is non-empty
and a received packet arrives on arc (2, 3). We can equivalently consider "candidate" packets
that arrive with probability $1 - q^{-\rho}$ whenever a received packet arrives on arc (2, 3) and say
that the queueing station is serviced whenever the queue is non-empty and a candidate packet
arrives on arc (2, 3). We consider all packets received on arc (1, 2) to be candidate packets.

The system we wish to analyze, therefore, is the following simple queueing system: Jobs arrive
at node 2 according to the arrival of received packets on arc (1, 2) and, with the exception of
the first $\rho - 1$ jobs, enter node 2's queue. The jobs in node 2's queue are serviced by the arrival
of candidate packets on arc (2, 3) and exit after being serviced. The number of jobs exiting is
a lower bound on the number of packets with linearly-independent auxiliary encoding vectors
received by node 3.

We analyze the queueing system of interest using the fluid approximation for discrete-flow
networks (see, for example, [48], [49]). We do not explicitly account for the fact that the first
$\rho - 1$ jobs arriving at node 2 do not enter its queue because this fact has no effect on job
throughput. Let $B_1$, B, and C be the counting processes for the arrival of received packets on
arc (1, 2), of innovative packets on arc (2, 3), and of candidate packets on arc (2, 3), respectively.
Let $Q(\tau)$ be the number of jobs queued for service at node 2 at time $\tau$. Hence $Q = B_1 - B$.
Let $X := B_1 - C$ and $Y := C - B$. Then

$$Q = X + Y. \tag{8}$$

Moreover, we have

$$Q(\tau)\,dY(\tau) = 0, \tag{9}$$

$$dY(\tau) \geq 0, \tag{10}$$

and

$$Q(\tau) \geq 0 \tag{11}$$

for all $\tau \geq 0$, and

$$Y(0) = 0. \tag{12}$$

We observe now that equations (8)-(12) give us the conditions for a Skorohod problem (see,
for example, [48, Section 7.2]) and, by the oblique reflection mapping theorem, there is a
well-defined, Lipschitz-continuous mapping $\Phi$ such that $Q = \Phi(X)$.

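Let

$$\bar{C}^{(K)}(\tau) := \frac{C(K\tau)}{K},$$

$$\bar{X}^{(K)}(\tau) := \frac{X(K\tau)}{K},$$

and

$$\bar{Q}^{(K)}(\tau) := \frac{Q(K\tau)}{K}.$$

Recall that $A_{23}$ is the counting process for the arrival of received packets on arc (2, 3).
Therefore, $C(\tau)$ is the sum of $A_{23}(\tau)$ Bernoulli-distributed random variables with parameter
$1 - q^{-\rho}$. Hence

$$\bar{C}(\tau) := \lim_{K \to \infty} \bar{C}^{(K)}(\tau) = \lim_{K \to \infty} (1 - q^{-\rho}) \frac{A_{23}(K\tau)}{K} \ \text{a.s.} = (1 - q^{-\rho}) z_{23} \tau \ \text{a.s.},$$

where the last equality follows by the assumptions of the model. Therefore

$$\bar{X}(\tau) := \lim_{K \to \infty} \bar{X}^{(K)}(\tau) = (z_{12} - (1 - q^{-\rho}) z_{23}) \tau \ \text{a.s.}$$

By the Lipschitz-continuity of $\Phi$, then, it follows that $\bar{Q} := \lim_{K \to \infty} \bar{Q}^{(K)} = \Phi(\bar{X})$, i.e. $\bar{Q}$ is,
almost surely, the unique $\bar{Q}$ that satisfies, for some $\bar{Y}$,

$$\bar{Q}(\tau) = (z_{12} - (1 - q^{-\rho}) z_{23}) \tau + \bar{Y}(\tau), \tag{13}$$

$$\bar{Q}(\tau)\,d\bar{Y}(\tau) = 0, \tag{14}$$

$$d\bar{Y}(\tau) \geq 0, \tag{15}$$

and

$$\bar{Q}(\tau) \geq 0 \tag{16}$$

for all $\tau \geq 0$, and

$$\bar{Y}(0) = 0. \tag{17}$$

A pair $(\bar{Q}, \bar{Y})$ that satisfies (13)-(17) is

$$\bar{Q}(\tau) = (z_{12} - (1 - q^{-\rho}) z_{23})^+ \tau \tag{18}$$

and

$$\bar{Y}(\tau) = (z_{12} - (1 - q^{-\rho}) z_{23})^- \tau.$$

Hence $\bar{Q}$ is given by equation (18).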
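
In one dimension the reflection mapping has a well-known explicit form: the regulator is the running maximum of the negative part of the netput, and the queue is the netput plus the regulator. The sketch below applies this form to a sampled linear netput and recovers the solution (18); the function names, the sampling grid, and the example rates are assumptions made for illustration.

```python
def reflect(netput):
    """One-dimensional Skorohod reflection of a sampled path starting at 0.

    Returns (queue, regulator) with queue = netput + regulator, where the
    regulator is nondecreasing and increases only when the queue is at zero.
    """
    queue, regulator = [], []
    running_min = 0.0
    for x in netput:
        running_min = min(running_min, x)
        regulator.append(-running_min)
        queue.append(x - running_min)
    return queue, regulator


def fluid_netput(z12, z23, q, rho, times):
    """Fluid-limit netput (z12 - (1 - q^-rho) z23) * tau on a time grid."""
    drift = z12 - (1 - q ** (-rho)) * z23
    return [drift * t for t in times]


times = [0.0, 0.5, 1.0, 1.5, 2.0]
# Hypothetical rates: arrivals outpace service, so the queue grows linearly.
print(reflect(fluid_netput(3.0, 2.0, 2, 1, times)))
# Service outpaces arrivals: the queue stays empty and the regulator grows.
print(reflect(fluid_netput(1.0, 4.0, 2, 1, times)))
```

With a positive drift the queue grows at that drift and the regulator stays at zero; with a negative drift the roles are exchanged, matching the positive and negative parts in the solution above.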

Recall that node 3 can recover the message packets with high probability if it receives $\lfloor K(1 + \varepsilon) \rfloor$
packets with linearly-independent auxiliary encoding vectors and that the number of jobs
exiting the queueing system is a lower bound on the number of packets with linearly-independent
auxiliary encoding vectors received by node 3. Therefore, node 3 can recover the message packets
with high probability if $\lfloor K(1 + \varepsilon) \rfloor$ or more jobs exit the queueing system. Let $\nu$ be the number
of jobs that have exited the queueing system by time $\Delta$. Then

$$\nu = B_1(\Delta) - Q(\Delta).$$

Take $K = \lceil (1 - q^{-\rho}) \Delta R_c R / (1 + \varepsilon) \rceil$, where $0 < R_c < 1$. Then

$$\lim_{K \to \infty} \frac{\nu}{\lfloor K(1 + \varepsilon) \rfloor} = \lim_{K \to \infty} \frac{B_1(\Delta) - Q(\Delta)}{K(1 + \varepsilon)} = \frac{z_{12} - (z_{12} - (1 - q^{-\rho}) z_{23})^+}{(1 - q^{-\rho}) R_c R} = \frac{\min(z_{12}, (1 - q^{-\rho}) z_{23})}{(1 - q^{-\rho}) R_c R} \geq \frac{1}{R_c} \frac{\min(z_{12}, z_{23})}{R} > 1$$

provided that

$$R \leq \min(z_{12}, z_{23}). \tag{19}$$

Hence, for all R satisfying (19), $\nu \geq \lfloor K(1 + \varepsilon) \rfloor$ with probability arbitrarily close to 1 for K
sufficiently large. The rate achieved is

$$\frac{K}{\Delta} \geq \frac{(1 - q^{-\rho}) R_c}{1 + \varepsilon} R,$$

which can be made arbitrarily close to R by varying $\rho$, $R_c$, and $\varepsilon$.
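
The fluid analysis predicts that jobs exit the queueing system at rate $\min(z_{12}, (1 - q^{-\rho}) z_{23})$. The event-driven sketch below checks this against a simulated instance. It assumes Poisson arrivals on both arcs, which is only one of the reception processes the analysis allows; the function name, the seed, and the example parameters are likewise illustrative assumptions.

```python
import random


def simulate_two_link_queue(z12, z23, q, rho, horizon, seed=0):
    """Simulate the job queue at node 2; return exits per unit time.

    Received packets arrive on arcs (1, 2) and (2, 3) as independent Poisson
    processes of rates z12 and z23. A packet on arc (2, 3) is a candidate
    with probability 1 - q^-rho and serves the queue if it is non-empty.
    """
    rng = random.Random(seed)
    candidate_prob = 1.0 - q ** (-rho)
    total_rate = z12 + z23
    time, arrivals, queue, exits = 0.0, 0, 0, 0
    while True:
        time += rng.expovariate(total_rate)  # next event of the merged process
        if time > horizon:
            break
        if rng.random() < z12 / total_rate:
            arrivals += 1
            if arrivals > rho - 1:  # the first rho - 1 jobs do not enter the queue
                queue += 1
        elif queue > 0 and rng.random() < candidate_prob:
            queue -= 1
            exits += 1
    return exits / horizon


# Hypothetical rates; the prediction is min(1.0, (1 - 2**-3) * 2.0) = 1.0.
print(simulate_two_link_queue(1.0, 2.0, q=2, rho=3, horizon=5000.0))
```

With a larger field size or innovation order the thinning factor approaches 1 and the simulated throughput approaches $\min(z_{12}, z_{23})$.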

B. L-link tandem network 

For i = 2, 3, ..., L + 1, we associate with node i the set of vectors $V_i$, which varies with time
and is initially empty. We define $U := V_2$ and $W := V_{L+1}$. As in the case of the two-link tandem,
all packets received by node 2 are considered innovative and, if packet x is received by node 2
at time $\tau$, then its auxiliary encoding vector $\beta$ is added to U at time $\tau$. For i = 3, 4, ..., L + 1,
if packet x, with auxiliary encoding vector $\beta$, is received by node i at time $\tau$, then we say x
is innovative if $\beta \notin \mathrm{span}(V_i(\tau))$ and $|V_{i-1}(\tau)| > |V_i(\tau)| + \rho - 1$. If x is innovative, then $\beta$ is
added to $V_i$ at time $\tau$.

This definition of innovative is a straightforward extension of that in Appendix I-A. The first
property remains the same: we continue to require that $W(\Delta)$ is a set of linearly-independent
vectors. We extend the second property so that, when a packet is received by node i for any
i = 3, 4, ..., L + 1 and $|V_{i-1}(\tau)| > |V_i(\tau)| + \rho - 1$, it is innovative with high probability.

Take some $i \in \{3, 4, \ldots, L + 1\}$. Suppose that packet x, with auxiliary encoding vector $\beta$, is
received by node i at time $\tau$ and that $|V_{i-1}(\tau)| > |V_i(\tau)| + \rho - 1$. Thus, the auxiliary encoding
vector $\beta$ is a random linear combination of vectors in some set $V_0$ that contains $V_{i-1}(\tau)$. Hence,
because $\beta$ is uniformly-distributed over $q^{|V_0|}$ possibilities, of which at least $q^{|V_0|} - q^{|V_i(\tau)|}$ are not
in $\mathrm{span}(V_i(\tau))$, it follows that

$$\Pr(\beta \notin \mathrm{span}(V_i(\tau))) \geq \frac{q^{|V_0|} - q^{|V_i(\tau)|}}{q^{|V_0|}} = 1 - q^{|V_i(\tau)| - |V_0|} \geq 1 - q^{|V_i(\tau)| - |V_{i-1}(\tau)|} \geq 1 - q^{-\rho}.$$

Therefore x is innovative with probability at least $1 - q^{-\rho}$.

Following the argument in Appendix I-A, we see, for all i = 2, 3, ..., L, that the propagation
of innovative packets through node i is described by the propagation of jobs through a
single-server queueing station with queue size $(|V_i(\tau)| - |V_{i+1}(\tau)| - \rho + 1)^+$ and that the queueing
station is serviced with probability $1 - q^{-\rho}$ whenever the queue is non-empty and a received
packet arrives on arc (i, i + 1). We again consider candidate packets that arrive with probability
$1 - q^{-\rho}$ whenever a received packet arrives on arc (i, i + 1) and say that the queueing station is
serviced whenever the queue is non-empty and a candidate packet arrives on arc (i, i + 1).

The system we wish to analyze in this case is therefore the following simple queueing network:
Jobs arrive at node 2 according to the arrival of received packets on arc (1, 2) and, with the
exception of the first $\rho - 1$ jobs, enter node 2's queue. For i = 2, 3, ..., L - 1, the jobs in
node i's queue are serviced by the arrival of candidate packets on arc (i, i + 1) and, with the
exception of the first $\rho - 1$ jobs, enter node (i + 1)'s queue after being serviced. The jobs in
node L's queue are serviced by the arrival of candidate packets on arc (L, L + 1) and exit after
being serviced. The number of jobs exiting is a lower bound on the number of packets with
linearly-independent auxiliary encoding vectors received by node L + 1.

We again analyze the queueing network of interest using the fluid approximation for
discrete-flow networks, and we again do not explicitly account for the fact that the first $\rho - 1$ jobs arriving
at a queueing node do not enter its queue. Let $B_1$ be the counting process for the arrival of
received packets on arc (1, 2). For i = 2, 3, ..., L, let $B_i$ and $C_i$ be the counting processes
for the arrival of innovative packets and candidate packets on arc (i, i + 1), respectively. Let
$Q_i(\tau)$ be the number of jobs queued for service at node i at time $\tau$. Hence, for i = 2, 3, ..., L,
$Q_i = B_{i-1} - B_i$. Let $X_i := C_{i-1} - C_i$ and $Y_i := C_i - B_i$, where $C_1 := B_1$. Then, we obtain a
Skorohod problem with the following conditions: For all i = 2, 3, ..., L,

$$Q_i = X_i - Y_{i-1} + Y_i.$$

For all $\tau \geq 0$ and i = 2, 3, ..., L,

$$Q_i(\tau)\,dY_i(\tau) = 0,$$

$$dY_i(\tau) \geq 0,$$

and

$$Q_i(\tau) \geq 0.$$

For all i = 2, 3, ..., L,

$$Y_i(0) = 0.$$

Let

$$\bar{Q}_i^{(K)}(\tau) := \frac{Q_i(K\tau)}{K}$$

and $\bar{Q}_i := \lim_{K \to \infty} \bar{Q}_i^{(K)}$ for i = 2, 3, ..., L. Then the vector $\bar{Q}$ is, almost surely, the unique $\bar{Q}$
that satisfies, for some $\bar{Y}$,

$$\bar{Q}_i(\tau) = \begin{cases} (z_{12} - (1 - q^{-\rho}) z_{23}) \tau + \bar{Y}_2(\tau) & \text{if } i = 2, \\ (1 - q^{-\rho})(z_{(i-1)i} - z_{i(i+1)}) \tau + \bar{Y}_i(\tau) - \bar{Y}_{i-1}(\tau) & \text{otherwise,} \end{cases} \tag{20}$$

$$\bar{Q}_i(\tau)\,d\bar{Y}_i(\tau) = 0, \tag{21}$$

$$d\bar{Y}_i(\tau) \geq 0, \tag{22}$$

and

$$\bar{Q}_i(\tau) \geq 0 \tag{23}$$

for all $\tau \geq 0$ and i = 2, 3, ..., L, and

$$\bar{Y}_i(0) = 0 \tag{24}$$

for all i = 2, 3, ..., L.

A pair $(\bar{Q}, \bar{Y})$ that satisfies (20)-(24) is

$$\bar{Q}_i(\tau) = \left(\min\left(z_{12}, \min_{2 \leq j < i}\{(1 - q^{-\rho}) z_{j(j+1)}\}\right) - (1 - q^{-\rho}) z_{i(i+1)}\right)^+ \tau \tag{25}$$

and

$$\bar{Y}_i(\tau) = \left(\min\left(z_{12}, \min_{2 \leq j < i}\{(1 - q^{-\rho}) z_{j(j+1)}\}\right) - (1 - q^{-\rho}) z_{i(i+1)}\right)^- \tau.$$

Hence $\bar{Q}$ is given by equation (25).
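
The fluid solution (25) can be evaluated link by link: the inflow to each queue is the smaller of the upstream inflow and the upstream service rate. The sketch below computes the growth rate of each queue and the resulting exit rate; the function name, the list representation of the link rates, and the example values are assumptions made for illustration.

```python
def tandem_fluid_rates(link_rates, q, rho):
    """Growth rates of the fluid queues at nodes 2..L and the exit rate.

    link_rates[k] is z_{(k+1)(k+2)}, so link_rates[0] is z_12. Each queue
    grows at (inflow - service)^+, with service rate (1 - q^-rho) z_{i(i+1)}.
    """
    thinning = 1.0 - q ** (-rho)
    inflow = link_rates[0]  # jobs arrive at node 2 at rate z_12
    growth = []
    for z in link_rates[1:]:
        service = thinning * z
        growth.append(max(inflow - service, 0.0))  # slope of queue i in (25)
        inflow = min(inflow, service)              # departures feed the next node
    return growth, inflow


# Hypothetical 4-link tandem with q = 2 and rho = 1 (thinning factor 0.5).
growth, exit_rate = tandem_fluid_rates([3.0, 2.0, 4.0, 1.0], q=2, rho=1)
print(growth, exit_rate)
```

Subtracting the queue growth rates from $z_{12}$ gives the exit rate, which is the minimum of $z_{12}$ and the thinned downstream rates.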

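The number of jobs that have exited the queueing network by time $\Delta$ is given by

$$\nu = B_1(\Delta) - \sum_{i=2}^{L} Q_i(\Delta).$$

Take $K = \lceil (1 - q^{-\rho}) \Delta R_c R / (1 + \varepsilon) \rceil$, where $0 < R_c < 1$. Then

$$\lim_{K \to \infty} \frac{\nu}{\lfloor K(1 + \varepsilon) \rfloor} = \lim_{K \to \infty} \frac{B_1(\Delta) - \sum_{i=2}^{L} Q_i(\Delta)}{K(1 + \varepsilon)} = \frac{\min(z_{12}, \min_{2 \leq i \leq L}\{(1 - q^{-\rho}) z_{i(i+1)}\})}{(1 - q^{-\rho}) R_c R} \geq \frac{1}{R_c} \frac{\min_{1 \leq i \leq L}\{z_{i(i+1)}\}}{R} > 1 \tag{26}$$

provided that

$$R \leq \min_{1 \leq i \leq L}\{z_{i(i+1)}\}. \tag{27}$$

Hence, for all R satisfying (27), $\nu \geq \lfloor K(1 + \varepsilon) \rfloor$ with probability arbitrarily close to 1 for K
sufficiently large. The rate can again be made arbitrarily close to R by varying $\rho$, $R_c$, and $\varepsilon$.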

C. General unicast connection 

As described in Section IV-A.1, we decompose the flow vector f associated with a unicast
connection into a finite set of paths $\{p_1, p_2, \ldots, p_M\}$, each carrying positive flow $R_m$ for m =
1, 2, ..., M such that $\sum_{m=1}^{M} R_m = R$. We now rigorously show how each path $p_m$ can be treated
as a separate tandem network used to deliver innovative packets at rate arbitrarily close to $R_m$.

Consider a single path $p_m$. We write $p_m = \{i_1, i_2, \ldots, i_{L_m}, i_{L_m+1}\}$, where $i_1 = s$ and $i_{L_m+1} = t$.
For $l = 2, 3, \ldots, L_m + 1$, we associate with node $i_l$ the set of vectors $V_l^{(p_m)}$, which varies with
time and is initially empty. We define $U^{(p_m)} := V_2^{(p_m)}$ and $W^{(p_m)} := V_{L_m+1}^{(p_m)}$. Suppose packet
x, with auxiliary encoding vector $\beta$, is received by node $i_2$ at time $\tau$. We associate with x the
independent random variable $P_x$, which takes the value m with probability $R_m / z_{s i_2}$. If $P_x = m$,
then we say x is innovative on path $p_m$, and $\beta$ is added to $U^{(p_m)}$ at time $\tau$. Now suppose packet
x, with auxiliary encoding vector $\beta$, is received by node $i_l$ at time $\tau$, where $l \in \{3, 4, \ldots, L_m + 1\}$.
We associate with x the independent random variable $P_x$, which takes the value m with
probability $R_m / z_{i_{l-1} i_l}$. We say x is innovative on path $p_m$ if $P_x = m$, $\beta \notin \mathrm{span}(V_l^{(p_m)}(\tau) \cup V_{\backslash m})$,
and $|V_{l-1}^{(p_m)}(\tau)| > |V_l^{(p_m)}(\tau)| + \rho - 1$, where $V_{\backslash m} := \cup_{n=1}^{m-1} W^{(p_n)}(\Delta) \cup \cup_{n=m+1}^{M} U^{(p_n)}(\Delta)$.

This definition of innovative is somewhat more complicated than that in Appendices I-A
and I-B because we now have M paths that we wish to analyze separately. We have again
designed the definition to satisfy two properties: First, we require that $\cup_{m=1}^{M} W^{(p_m)}(\Delta)$ is
linearly-independent. This is easily verified: Vectors are added to $W^{(p_1)}(\tau)$ only if they are linearly
independent of existing ones; vectors are added to $W^{(p_2)}(\tau)$ only if they are linearly independent
of existing ones and ones in $W^{(p_1)}(\Delta)$; and so on. Second, we require that, when a packet is
received by node $i_l$, $P_x = m$, and $|V_{l-1}^{(p_m)}(\tau)| > |V_l^{(p_m)}(\tau)| + \rho - 1$, it is innovative on path $p_m$
with high probability.

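Take $l \in \{3, 4, \ldots, L_m + 1\}$. Suppose that packet x, with auxiliary encoding vector $\beta$, is
received by node $i_l$ at time $\tau$, that $P_x = m$, and that $|V_{l-1}^{(p_m)}(\tau)| > |V_l^{(p_m)}(\tau)| + \rho - 1$. Thus,
the auxiliary encoding vector $\beta$ is a random linear combination of vectors in some set $V_0$ that
contains $V_{l-1}^{(p_m)}(\tau)$. Hence $\beta$ is uniformly-distributed over $q^{|V_0|}$ possibilities, of which at least
$q^{|V_0|} - q^d$ are not in $\mathrm{span}(V_l^{(p_m)}(\tau) \cup V_{\backslash m})$, where $d := \dim(\mathrm{span}(V_0) \cap \mathrm{span}(V_l^{(p_m)}(\tau) \cup V_{\backslash m}))$.
We have

$$
\begin{aligned}
d &= \dim(\mathrm{span}(V_0)) + \dim(\mathrm{span}(V_l^{(p_m)}(\tau) \cup V_{\backslash m})) - \dim(\mathrm{span}(V_0 \cup V_l^{(p_m)}(\tau) \cup V_{\backslash m})) \\
&\leq \dim(\mathrm{span}(V_0 \setminus V_{l-1}^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_{l-1}^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_l^{(p_m)}(\tau) \cup V_{\backslash m})) \\
&\quad - \dim(\mathrm{span}(V_0 \cup V_l^{(p_m)}(\tau) \cup V_{\backslash m})) \\
&\leq \dim(\mathrm{span}(V_0 \setminus V_{l-1}^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_{l-1}^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_l^{(p_m)}(\tau) \cup V_{\backslash m})) \\
&\quad - \dim(\mathrm{span}(V_{l-1}^{(p_m)}(\tau) \cup V_l^{(p_m)}(\tau) \cup V_{\backslash m})).
\end{aligned}
$$

Since $V_{l-1}^{(p_m)}(\tau) \cup V_{\backslash m}$ and $V_l^{(p_m)}(\tau) \cup V_{\backslash m}$ both form linearly-independent sets,

$$
\begin{aligned}
&\dim(\mathrm{span}(V_{l-1}^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_l^{(p_m)}(\tau) \cup V_{\backslash m})) \\
&\quad = \dim(\mathrm{span}(V_{l-1}^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_l^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_{\backslash m})) \\
&\quad = \dim(\mathrm{span}(V_l^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_{l-1}^{(p_m)}(\tau) \cup V_{\backslash m})).
\end{aligned}
$$

Hence it follows that

$$
\begin{aligned}
d &\leq \dim(\mathrm{span}(V_0 \setminus V_{l-1}^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_l^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_{l-1}^{(p_m)}(\tau) \cup V_{\backslash m})) \\
&\quad - \dim(\mathrm{span}(V_{l-1}^{(p_m)}(\tau) \cup V_l^{(p_m)}(\tau) \cup V_{\backslash m})) \\
&\leq \dim(\mathrm{span}(V_0 \setminus V_{l-1}^{(p_m)}(\tau))) + \dim(\mathrm{span}(V_l^{(p_m)}(\tau))) \\
&\leq |V_0 \setminus V_{l-1}^{(p_m)}(\tau)| + |V_l^{(p_m)}(\tau)| \\
&= |V_0| - |V_{l-1}^{(p_m)}(\tau)| + |V_l^{(p_m)}(\tau)|,
\end{aligned}
$$

which yields

$$d - |V_0| \leq |V_l^{(p_m)}(\tau)| - |V_{l-1}^{(p_m)}(\tau)| \leq -\rho.$$

Therefore

$$\Pr(\beta \notin \mathrm{span}(V_l^{(p_m)}(\tau) \cup V_{\backslash m})) \geq \frac{q^{|V_0|} - q^d}{q^{|V_0|}} = 1 - q^{d - |V_0|} \geq 1 - q^{-\rho}.$$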

We see then that, if we consider only those packets such that $P_x = m$, the conditions that
govern the propagation of innovative packets are exactly those of an $L_m$-link tandem network,
which we dealt with in Appendix I-B. By recalling the distribution of $P_x$, it follows that the
propagation of innovative packets along path $p_m$ behaves like an $L_m$-link tandem network with
average arrival rate $R_m$ on every link. Since we have assumed nothing special about m, this
statement applies for all m = 1, 2, ..., M.
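
The role of the random variable $P_x$ is to thin the packets received on an arc so that each path sees an average arrival rate of $R_m$. The sketch below samples such a label for an arc shared by several paths. Returning None for a packet assigned to no path is an assumption of this sketch, as are the function name and the example rates; the text only specifies that the label equals m with probability $R_m$ divided by the arc rate.

```python
import random


def sample_path_label(path_rates, arc_rate, rng):
    """Return path index m with probability R_m / arc_rate, else None.

    path_rates maps the index m of each path using this arc to its rate R_m;
    the rates must not sum to more than arc_rate.
    """
    if sum(path_rates.values()) > arc_rate + 1e-12:
        raise ValueError("path rates exceed the arc rate")
    u = rng.random() * arc_rate
    cumulative = 0.0
    for m, rate in path_rates.items():
        cumulative += rate
        if u < cumulative:
            return m
    return None  # packet is not assigned to any path


# Hypothetical arc with z = 2.0 shared by paths 1 and 2.
rng = random.Random(0)
labels = [sample_path_label({1: 1.0, 2: 0.5}, 2.0, rng) for _ in range(100000)]
print(labels.count(1) / len(labels), labels.count(2) / len(labels))
```

Multiplying each empirical frequency by the arc rate recovers the per-path arrival rates, here approximately 1.0 and 0.5.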

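Take $K = \lceil (1 - q^{-\rho}) \Delta R_c R / (1 + \varepsilon) \rceil$, where $0 < R_c < 1$. Then, by equation (26),

$$\lim_{K \to \infty} \frac{|W^{(p_m)}(\Delta)|}{\lfloor K(1 + \varepsilon) \rfloor} > \frac{R_m}{R}.$$

Hence

$$\lim_{K \to \infty} \frac{|\cup_{m=1}^{M} W^{(p_m)}(\Delta)|}{\lfloor K(1 + \varepsilon) \rfloor} = \sum_{m=1}^{M} \lim_{K \to \infty} \frac{|W^{(p_m)}(\Delta)|}{\lfloor K(1 + \varepsilon) \rfloor} > \sum_{m=1}^{M} \frac{R_m}{R} = 1.$$

As before, the rate can be made arbitrarily close to R by varying $\rho$, $R_c$, and $\varepsilon$.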

D. Wireless packet networks 

The constraint (3) can also be written as

$$f_{iJj} \leq \sum_{\{L \subset J \mid j \in L\}} \alpha^{(j)}_{iJL} z_{iJL}$$

for all $(i,J) \in \mathcal{A}$ and $j \in J$, where $\sum_{j \in L} \alpha^{(j)}_{iJL} = 1$ for all $(i,J) \in \mathcal{A}$ and $L \subset J$, and $\alpha^{(j)}_{iJL} \geq 0$
for all $(i,J) \in \mathcal{A}$, $L \subset J$, and $j \in L$. Suppose packet x is placed on hyperarc (i, J) and received
by $K \subset J$ at time $\tau$. We associate with x the independent random variable $P_x$, which takes the
value m with probability $R_m \alpha^{(j)}_{iJK} / \sum_{\{L \subset J \mid j \in L\}} \alpha^{(j)}_{iJL} z_{iJL}$, where j is the outward neighbor of i
on $p_m$. Using this definition of $P_x$ in place of that used in Appendix I-C in the case of wireline
packet networks, we find that the two cases become identical, with the propagation of innovative
packets along each path $p_m$ behaving like a tandem network with average arrival rate $R_m$ on
every link.